Module 6 - Resampling and Cross-Validation

Overview

In traditional statistical modeling, model fits are evaluated using test statistics, hypothesis tests, or examination of the posterior distribution. These tools are often unavailable for machine learning models because the models are too complex. Instead, computational methods based on resampling have been developed that allow for estimation of uncertainty, assessment of out-of-sample accuracy (generalization), and model comparison. This week we begin our exploration of these tools by studying resampling, the bootstrap, and cross-validation. Cross-validation is one of the most crucial techniques for evaluating machine learning models.

Learning Objectives

  • Understand how to apply cross-validation to assess out-of-sample accuracy
  • Understand the trade-offs in different data splits
  • Apply the bootstrap to estimate uncertainty in predictions and parameters
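To make the objectives above concrete, here is a minimal sketch of both techniques using only the Python standard library. The synthetic data, the simple linear model, and the helper names (fit_line, kfold_mse, bootstrap_slope_se) are illustrative assumptions and are not taken from the course materials.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares intercept and slope for a single predictor."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def kfold_mse(xs, ys, k=5, seed=0):
    """Estimate out-of-sample MSE with k-fold cross-validation."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint index sets
    fold_mses = []
    for i, test in enumerate(folds):
        # Fit on every fold except the held-out one
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        a, b = fit_line([xs[j] for j in train], [ys[j] for j in train])
        # Score only on the held-out fold
        sq_errors = [(ys[j] - (a + b * xs[j])) ** 2 for j in test]
        fold_mses.append(statistics.fmean(sq_errors))
    return statistics.fmean(fold_mses)


def bootstrap_slope_se(xs, ys, n_boot=500, seed=0):
    """Bootstrap standard error of the fitted slope."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(n_boot):
        # Resample rows with replacement, then refit
        sample = [rng.randrange(n) for _ in range(n)]
        _, b = fit_line([xs[j] for j in sample], [ys[j] for j in sample])
        slopes.append(b)
    return statistics.stdev(slopes)


# Hypothetical synthetic data: y = 1 + 2x + noise (noise sd 0.5)
data_rng = random.Random(42)
xs = [data_rng.gauss(0, 1) for _ in range(200)]
ys = [1.0 + 2.0 * x + data_rng.gauss(0, 0.5) for x in xs]

cv_mse = kfold_mse(xs, ys, k=5)
slope_se = bootstrap_slope_se(xs, ys)
print(f"5-fold CV estimate of MSE: {cv_mse:.3f}")
print(f"Bootstrap SE of slope:     {slope_se:.3f}")
```

Changing k in kfold_mse is one way to explore the data-split trade-off: a larger k trains each model on more data but requires more fits.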

Readings

  • ISLP (Introduction to Statistical Learning): Chapter 5

Videos